Inducing phonetic distances from dialect variation

نویسندگان

  • Martijn Wieling
  • Eliza Margaretha
  • John Nerbonne
چکیده

In this study we attempt to derive phonetic distances from alternative dialectal pronunciations used in different geographical varieties. We use two dialect atlases each containing the phonetic transcriptions of the same set of words at hundreds of sites. We collect the sound correspondences through alignment with the Levenshtein distance algorithm, and then apply an information-theoretic measure, pointwise mutual information, assigning smaller segment distances to segments which frequently correspond. We iterate alignment and information-theoretic distance assignment until both stabilize and we evaluate the quality of the phonetic distances obtained by comparing them to acoustic vowel distances. For both Dutch and German, we find strong correlations between the induced phonetic distances and the acoustic distances, illustrating the usefulness of the method in deriving valid phonetic distances from dialectal pronunciations.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Inducing a measure of phonetic similarity from pronunciation variation

Structuralists famously observed that language is ”un systême oû tout se tient” (Meillet, 1903, p. 407), insisting that the system of relations of linguistic units was more important than their concrete content. This study attempts to derive content from relations, in particular phonetic (acoustic) content from the distribution of alternative pronunciations used in different geographical variet...

متن کامل

Perceptive evaluation of Levenshtein dialect distance measurements using Norwegian dialect data

The Levenshtein dialect distance method has proven to be a successful method for measuring phonetic distances between Dutch dialects. The aim of the present investigation is to validate the Levenshtein dialect distance with perceptual data from a language area other than the Dutch, namely Norway. We calculate the correlation between the Levenshtein distances and the distances between 15 Norwegi...

متن کامل

Comparison and Classification of Dialects

This project measures and classifies language variation. In contrast to earlier dialectology, we seek a comprehensive characterization of (potentially gradual) differences between dialects, rather than a geographic delineation of (discrete) features of individual words or pronunciations. More general characterizations of dialect differences then become available. We measure phonetic (un)related...

متن کامل

Exploring Dialect Phonetic Variation Using PARAFAC

In this paper we apply the multi-way decomposition method PARAFAC in order to detect the most prominent sound changes in dialect variation. We investigate various phonetic patterns, both in stressed and unstressed syllables. We proceed from regular sound correspondences which are automatically extracted from the aligned transcriptions and analyzed using PARAFAC. This enables us to analyze simul...

متن کامل

Measuring Norwegian dialect distances using acoustic features

Computational dialectometry has been proven to be useful for finding dialect relationships and identifying dialect areas. The first to develop a method of measuring dialect distances was Jean Séguy, assisted and inspired by Henri Guiter (Chambers and Trudgill, 1998). Strongly related to the methodology of Séguy is the work of Goebl, although the basis of Goebl’s work was developed mainly in dep...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011